    Quantifying single nucleotide variant detection sensitivity in exome sequencing

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits.
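The "naive expectation of sampling from a binomial" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' calibrated model: the calling rule (at least `min_alt_reads` reads carrying the non-reference allele), the 50% allele fraction, and both function names are assumptions made here for illustration only.

```python
from math import comb


def naive_het_sensitivity(depth: int, min_alt_reads: int = 2,
                          alt_fraction: float = 0.5) -> float:
    """Probability that at least `min_alt_reads` of `depth` reads carry the
    alternate allele, if reads are drawn as Binomial(depth, alt_fraction).

    Assumption: a heterozygous SNV counts as "detected" once this many
    alternate reads are seen. Real callers use more elaborate criteria.
    """
    if depth < min_alt_reads:
        return 0.0
    # Sum the probabilities of seeing too few alternate reads (0 .. min-1).
    p_miss = sum(
        comb(depth, k) * alt_fraction ** k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_miss


def min_depth_for(target: float, min_alt_reads: int = 2,
                  alt_fraction: float = 0.5, max_depth: int = 1000) -> int:
    """Smallest depth at which the naive sensitivity reaches `target`."""
    for depth in range(1, max_depth + 1):
        if naive_het_sensitivity(depth, min_alt_reads, alt_fraction) >= target:
            return depth
    raise ValueError("target sensitivity not reached within max_depth")


if __name__ == "__main__":
    for depth in (3, 8, 13, 20):
        print(depth, round(naive_het_sensitivity(depth), 4))
    print("naive depth for 95%:", min_depth_for(0.95))
```

Under these assumed settings the naive model reaches 95% at a local depth of 8X, compared with the 13X the abstract reports from empirical calibration, which is consistent with its statement that measured sensitivity is worse than the binomial expectation.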

    Whole-genome array-CGH for detection of submicroscopic chromosomal imbalances in children with mental retardation

    Chromosomal imbalances are the major cause of mental retardation (MR). Many of these imbalances are caused by submicroscopic deletions or duplications not detected by conventional cytogenetic methods. Microarray-based comparative genomic hybridization (array-CGH) is considered to be superior for the investigation of chromosomal aberrations in children with MR, and has been demonstrated to improve the diagnostic detection rate of these small chromosomal abnormalities. In this study we used 1 Mb genome-wide array-CGH to screen 48 children with MR and congenital malformations of unknown cause for submicroscopic chromosomal imbalances. All children were clinically investigated and subtelomere FISH analysis had been performed in all cases. Suspected microdeletion syndromes such as deletion 22q11.2, Williams-Beuren and Angelman syndromes were excluded before array-CGH analysis was performed. We identified de novo interstitial chromosomal imbalances in two patients (4%), and an interstitial deletion inherited from an affected mother in one patient (2%). In another two of the children (4%), suspected imbalances were detected but were also found in one of the non-affected parents. The yield of de novo alterations detected in this study is somewhat lower than previously described, and might reflect the importance of the patient selection criteria applied before array-CGH analysis is performed. However, array-CGH proved to be a high-quality and reliable tool for genome-wide screening of MR patients of unknown etiology.

    Array-CGH and breast cancer

    The introduction of comparative genomic hybridization (CGH) in 1992 opened new avenues in genomic investigation; in particular, it advanced analysis of solid tumours, including breast cancer, because it obviated the need to culture cells before their chromosomes could be analyzed. The current generation of CGH analysis uses ordered arrays of genomic DNA sequences and is therefore referred to as array-CGH or matrix-CGH. It was introduced in 1998, and further increased the potential of CGH to provide insight into the fundamental processes of chromosomal instability and cancer. This review provides a critical evaluation of the data published on array-CGH and breast cancer, and discusses some of its expected future value and developments

    Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana

    Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are only based on first-order HMMs having constrained abilities to model spatial dependencies between measurements of closely adjacent chromosomal regions. Here, we develop parsimonious higher-order HMMs enabling the interpolation between a mixture model ignoring spatial dependencies and a higher-order HMM exhaustively modeling spatial dependencies. We apply parsimonious higher-order HMMs to the analysis of Array-CGH data of the accessions C24 and Col-0 of the model plant Arabidopsis thaliana. We compare these models against first-order HMMs and other existing methods using a reference of known deletions and sequence deviations. We find that parsimonious higher-order HMMs clearly improve the identification of these polymorphisms. Moreover, we perform a functional analysis of identified polymorphisms revealing novel details of genomic differences between C24 and Col-0. Additional model evaluations are done on widely considered Array-CGH data of human cell lines indicating that parsimonious HMMs are also well-suited for the analysis of non-plant specific data. All these results indicate that parsimonious higher-order HMMs are useful for Array-CGH analyses. An implementation of parsimonious higher-order HMMs is available as part of the open source Java library Jstacs (www.jstacs.de/index.php/PHHMM)
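As a point of reference for the first-order HMMs that this abstract uses as a baseline, the following is a minimal sketch of Viterbi decoding over array-CGH log-ratios. It is not the parsimonious higher-order model from the paper and not the Jstacs implementation; the three states, the Gaussian emission means, the shared standard deviation, and the self-transition probability are all illustrative assumptions.

```python
import math

# Assumed copy-number states with assumed log2-ratio emission means.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}
SIGMA = 0.2   # assumed shared emission standard deviation
STAY = 0.9    # assumed probability of keeping the same state at the next probe


def log_gauss(x: float, mu: float, sigma: float) -> float:
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_trans(prev: str, cur: str) -> float:
    """Log transition probability of a first-order chain."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(log_ratios: list[float]) -> list[str]:
    """Most probable state path for a sequence of probe log-ratios."""
    if not log_ratios:
        return []
    # Uniform initial distribution over states.
    score = {s: math.log(1 / len(STATES)) + log_gauss(log_ratios[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []  # one dict of back-pointers per probe after the first
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_gauss(x, MEANS[cur], SIGMA))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi([0.0, 0.05, -0.5, -0.55, -0.45, 0.0, 0.02]))
```

The transition term is what distinguishes this from a mixture model that ignores spatial dependencies: with these assumed parameters, a run of three low probes is decoded as a loss, while a single moderately low probe surrounded by neutral ones is not.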

    Characterising chromosome rearrangements: recent technical advances in molecular cytogenetics

    Genomic rearrangements can result in losses, amplifications, translocations and inversions of DNA fragments thereby modifying genome architecture, and potentially having clinical consequences. Many genomic disorders caused by structural variation have initially been uncovered by early cytogenetic methods. The last decade has seen significant progression in molecular cytogenetic techniques, allowing rapid and precise detection of structural rearrangements on a whole-genome scale. The high resolution attainable with these recently developed techniques has also uncovered the role of structural variants in normal genetic variation alongside single-nucleotide polymorphisms (SNPs). We describe how array-based comparative genomic hybridisation, SNP arrays, array painting and next-generation sequencing analytical methods (read depth, read pair and split read) allow the extensive characterisation of chromosome rearrangements in human genomes

    High-resolution DNA copy number profiling of malignant peripheral nerve sheath tumors using targeted microarray-based comparative genomic hybridization

    Purpose: Neurofibromatosis type 1 (NF1) is an autosomal dominant condition that predisposes to benign and malignant tumors. The lifetime risk of a malignant peripheral nerve sheath tumor (MPNST) in NF1 is ∼10%. These tumors have a poor survival rate and their molecular basis remains unclear. We report the first comprehensive investigation of DNA copy number across a multitude of genes in NF1 tumors using high-resolution array comparative genomic hybridization (CGH), with the aim of identifying molecular signatures that delineate malignant from benign NF1 tumors. Experimental Design: We constructed an exon-level resolution microarray encompassing 57 selected genes and profiled DNA from 35 MPNSTs, 16 plexiform, and 8 dermal neurofibromas. Bioinformatic analysis was done on array CGH data to identify concurrent aberrations in malignant tumors. Results: The array CGH profiles of MPNSTs and neurofibromas were markedly different. A number of MPNST-specific alterations were identified, including amplifications of ITGB4, PDGFRA, MET, TP73, and HGF plus deletions in NF1, HMMR/RHAMM, MMP13, L1CAM2, p16INK4A/CDKN2A, and TP53. Copy number changes of HMMR/RHAMM, MMP13, p16INK4A/CDKN2A, and ITGB4 were observed in 46%, 43%, 39%, and 32%, respectively, of the malignant tumors, implicating these genes in MPNST pathogenesis. Concomitant amplifications of HGF, MET, and PDGFRA genes were also revealed in MPNSTs, suggesting a putative role of the p70S6K pathway in NF1 tumor progression. Conclusions: This study highlights the potential of array CGH in identifying novel diagnostic markers for MPNSTs.